FTS Models

1. Introduction

In this section, I will analyze financial time series data related to the COVID-19 pandemic. The time series data will consist of stocks from companies directly involved with COVID-19 or its vaccination efforts. Specifically, I will focus on Pfizer (PFE) and BioNTech SE (BNTX) due to their significant roles in developing and distributing COVID-19 vaccines.

Pfizer and Their COVID-19 Vaccine

Pfizer, a longstanding giant in the pharmaceutical industry, played a pivotal role during the COVID-19 pandemic through the development and distribution of a highly effective vaccine. Partnering with BioNTech, a German biotechnology company, Pfizer developed one of the first mRNA-based vaccines, which was granted emergency use authorization by the FDA in December 2020. This vaccine, branded as Comirnaty, showcased a new approach in vaccine technology with remarkable efficacy rates above 90% in preventing COVID-19 infection. Pfizer’s vaccine not only marked a significant scientific milestone but also became a critical tool in global efforts to combat the pandemic, leading to substantial impacts on its stock valuation.

BioNTech SE and Their COVID-19 Vaccine

BioNTech SE, while not as widely recognized globally as Pfizer before the pandemic, quickly became a household name due to its collaboration with Pfizer in developing the Comirnaty COVID-19 vaccine. Specializing in mRNA technology, BioNTech’s innovative approach was crucial in the rapid development and success of the vaccine. The company, founded in 2008, focused on patient-specific immunotherapies for cancer before pivoting to infectious diseases, which positioned it uniquely for the swift response to COVID-19. The collaboration with Pfizer not only accelerated the vaccine’s development and distribution but also significantly enhanced BioNTech’s market profile and stock performance.

To conduct the time series analysis of the financial data from Pfizer and BioNTech, I will employ ARCH (Autoregressive Conditional Heteroskedasticity) and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models. These models are well-suited for studying financial market data as they effectively model the changing volatility, a common characteristic of stock return data. By analyzing the volatility of the returns for both PFE and BNTX stocks, we can gain insights into how market perceptions of these companies’ roles in addressing the COVID-19 crisis have influenced their stock prices over time.
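As context for the modeling that follows, here is a minimal, self-contained sketch in base R of the GARCH(1,1) recursion these models estimate: today's conditional variance is a weighted combination of a constant, yesterday's squared return, and yesterday's variance. The parameters here are illustrative values of my choosing, not estimates from the PFE or BNTX data.

```r
# Sketch: simulate a GARCH(1,1) process to illustrate volatility clustering.
# omega, alpha, beta are illustrative; alpha + beta < 1 keeps the process stationary.
set.seed(42)
n      <- 1000
omega  <- 1e-5; alpha <- 0.1; beta <- 0.85
sigma2 <- numeric(n); r <- numeric(n)
sigma2[1] <- omega / (1 - alpha - beta)      # start at the unconditional variance
r[1] <- sqrt(sigma2[1]) * rnorm(1)
for (t in 2:n) {
  sigma2[t] <- omega + alpha * r[t - 1]^2 + beta * sigma2[t - 1]
  r[t] <- sqrt(sigma2[t]) * rnorm(1)
}
# Large |r| tends to follow large |r|: the clustering GARCH is built to capture
plot.ts(r, main = "Simulated GARCH(1,1) returns")
```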

2. Plotting Data

Code
pfe <- getSymbols("PFE",auto.assign = FALSE, from = "2014-01-01",src="yahoo")

chartSeries(pfe, theme = chartTheme("white"),
            bar.type = "hlc",  
            up.col = "green",  
            dn.col = "red") 

The plot is a candlestick chart of Pfizer’s stock price from January 2014 to April 2024. Over this decade, the stock price shows a general upward trend until early 2021, with some volatility but consistently higher peaks and troughs, indicating growth over time. Notably, there is a sharp increase in stock price beginning around 2020, which could correlate with the development and deployment of Pfizer’s COVID-19 vaccine. After reaching a peak in early 2021, the stock exhibits a significant decline. The high trading volume spike at the beginning of 2021 suggests a period of high investor interest, potentially due to vaccine-related news. Following this, the price begins a consistent downward trend, falling to levels below the pre-2020 range. This decline could be indicative of market saturation, vaccine rollout completion, or shifting investor sentiment. As of the last recorded point, the stock price stands at approximately $26.78, which is lower than the previous years’ highs, highlighting a notable retraction from the peak seen during the height of the pandemic response.

Code
bntx <- getSymbols("BNTX",auto.assign = FALSE, from = "2014-01-01",src="yahoo")

chartSeries(bntx, theme = chartTheme("white"),
            bar.type = "hlc",  
            up.col = "green",  
            dn.col = "red") 

The chart depicts the stock price movement for BioNTech SE from October 2019 to April 2024, which shows a remarkable volatility pattern. Initially, the stock price shows a gradual upward trend, but then experiences a significant surge starting around 2020, likely in response to its role in COVID-19 vaccine development and anticipation of profits from the vaccine. This surge peaks sharply, reaching the highest point in the observed period, indicative of intense investor optimism. Following this peak, the stock enters a period of decline and heightened volatility, eventually stabilizing but at a much lower price level than the peak. By April 2024, the price has settled at around $90, which, while lower than the high, still represents a substantial increase from the early days of trading. The trading volume shows several spikes, the largest of which coincides with the peak stock price, suggesting periods of significant investor activity possibly driven by news events or milestones in the company’s vaccine development and sales. The general trend, after the initial peak, shows that the excitement and speculation that drove the initial rise have tempered, leading to a more stable but lower stock price as the market adjusts to the new norm post-vaccine rollout.

3. Model Fitting

Code
pfets <- ts(pfe$PFE.Adjusted, start=decimal_date(as.Date("2014-01-02")), frequency = 365.25)

returns_pfe = log(pfets) %>% diff()
summary(returns_pfe)
  PFE.Adjusted       
 Min.   :-0.0805015  
 1st Qu.:-0.0067332  
 Median : 0.0000000  
 Mean   : 0.0001075  
 3rd Qu.: 0.0069172  
 Max.   : 0.1030546  
Code
autoplot(returns_pfe, color="#5a3196") + theme_bw() +ggtitle("Pfizer Stock Price Returns")

Returns fluctuate above and below a return of zero, with no clear long-term trend suggesting sustained periods of significant gains or losses. The plot indicates a high level of volatility, as seen by the spikes particularly in the later years around 2020, which could be associated with market reactions to the development and distribution of Pfizer’s COVID-19 vaccine. The fact that the returns oscillate without a clear directional trend implies that while there are periods of positive and negative returns, they tend to offset each other over this timeframe, suggesting a riskier investment during certain periods and the potential for short-term gains or losses.

Code
bntxts <- ts(bntx$BNTX.Adjusted, start=decimal_date(as.Date("2019-10-10")), frequency = 365.25)

returns_bntx = log(bntxts) %>% diff()
summary(returns_bntx)
 BNTX.Adjusted       
 Min.   :-0.4391793  
 1st Qu.:-0.0217427  
 Median : 0.0000811  
 Mean   : 0.0016000  
 3rd Qu.: 0.0233698  
 Max.   : 0.5098251  
Code
autoplot(returns_bntx, color="#5a3196") + theme_bw() +ggtitle("BioNTech SE Stock Price Returns")

The plot displays the daily stock price returns for BioNTech SE. Notably, there are periods of extreme volatility, as seen by the tall spikes, especially in early 2020, which likely correspond to the initial phases of the COVID-19 pandemic and the associated developments in BioNTech’s vaccine. As time progresses, the volatility seems to decrease somewhat, with the fluctuations in returns becoming less extreme, suggesting that the market has started to stabilize after the initial shocks. However, even with reduced volatility, the returns still show no consistent trend upward or downward over the entire time frame, indicating that while the extreme uncertainty may have subsided, the stock still experiences regular shifts in investor sentiment and market reaction to new information.

4. ACF and PACF Plots

4.1 Returns

Code
ggAcf(returns_pfe,40) + theme_bw()

Code
ggPacf(returns_pfe,40) + theme_bw()

From both the ACF and PACF plots, the Pfizer returns appear only weakly stationary. Candidate values for both p and q are 3, 9, and 10.

Code
ggAcf(returns_bntx,40) + theme_bw()

Code
ggPacf(returns_bntx,40) + theme_bw()

From both the ACF and PACF plots, the BioNTech SE returns appear only weakly stationary. Candidate values for both p and q are 9 and 32. Since these lags are too high to be practical, I will likely consider p and q of 0 or 9.

4.2 Absolute Returns

Code
ggAcf(abs(returns_pfe),40) + theme_bw()

Code
ggPacf(abs(returns_pfe),40) + theme_bw()

The analysis of absolute returns for Pfizer stock reveals persistent autocorrelation in the ACF plot, with all of the first 40 lags significant. This suggests a strong and consistent relationship in the data's movements over these intervals. The PACF plot, by contrast, indicates a more contained range of significant lags: only the first 8, along with the 11th and 15th lags, show significant partial autocorrelation.

Code
ggAcf(abs(returns_bntx),40) + theme_bw()

Code
ggPacf(abs(returns_bntx),40) + theme_bw()

The analysis of absolute returns for BioNTech SE stock likewise reveals persistent autocorrelation in the ACF plot, with all of the first 40 lags significant. The PACF plot indicates a more contained range of significant lags: only the first 3, along with the 5th, 8th, and 9th lags, show significant partial autocorrelation.

4.3 Squared Returns

Code
ggAcf(returns_pfe^2,40) + theme_bw()

Code
ggPacf(returns_pfe^2,40) + theme_bw()

Squared returns are less stationary than absolute returns but more stationary than the raw returns. In the Pfizer ACF plot, lags 1 through 21 are significant; in the PACF plot, lags 1 to 5, 8, 11, and 14 are significant.

Code
ggAcf(returns_bntx^2,40) + theme_bw()

Code
ggPacf(returns_bntx^2,40) + theme_bw()

Squared returns are less stationary than absolute returns but more stationary than the raw returns. In the BioNTech SE ACF plot, lags 1 through 7 are significant; in the PACF plot, lags 1 through 6 are significant.
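The contrast these plots pick up can be reproduced with a toy example in base R (simulated data of my own, not the stock series): a series whose volatility changes over time has essentially uncorrelated returns but clearly autocorrelated squared returns.

```r
# Toy illustration: two volatility regimes, independent draws.
set.seed(1)
r <- rnorm(2000, sd = rep(c(0.01, 0.05), each = 1000))

# Lag-1 autocorrelation of returns vs. squared returns
acf_r  <- acf(r,   plot = FALSE)$acf[2]
acf_r2 <- acf(r^2, plot = FALSE)$acf[2]
# acf_r is near zero, while acf_r2 is clearly positive, mirroring the
# significant lags seen in the squared-return ACF plots above.
```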

5. ARCH Test

In this section, the ARCH Lagrange Multiplier (LM) test will be employed to examine the returns data for autoregressive conditional heteroskedasticity (ARCH) effects. The null hypothesis of the LM test posits the absence of ARCH effects in the time series. With the significance threshold set at 0.05, a p-value below this level provides sufficient evidence to reject the null hypothesis. Should this occur, it would indicate that an ARCH effect is present: the data exhibits volatility clustering, where periods of high volatility tend to be followed by high volatility, and periods of low volatility by low volatility.
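The mechanics of the LM test can be sketched by hand in base R. This mirrors the logic behind `FinTS::ArchTest` (regress squared demeaned returns on their own lags; under the null, n·R² follows a chi-squared distribution with `lags` degrees of freedom), but it is an illustrative reimplementation, not that package's code.

```r
# Hand-rolled ARCH LM test (illustrative sketch).
arch_lm <- function(x, lags = 1) {
  e2  <- (x - mean(x))^2                           # squared demeaned series
  n   <- length(e2) - lags                         # observations in the regression
  y   <- e2[(lags + 1):length(e2)]
  X   <- embed(e2, lags + 1)[, -1, drop = FALSE]   # lagged squared values
  fit <- lm(y ~ X)
  stat <- n * summary(fit)$r.squared               # LM statistic: n * R^2
  p    <- pchisq(stat, df = lags, lower.tail = FALSE)
  list(statistic = stat, p.value = p)
}
```

Applied to a volatility-clustered series, the statistic is large and the p-value is effectively zero, which is the pattern of rejections seen below for both stocks.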

Code
ArchTest(returns_pfe, lags=1, demean=TRUE)

    ARCH LM-test; Null hypothesis: no ARCH effects

data:  returns_pfe
Chi-squared = 176.72, df = 1, p-value < 2.2e-16

The result of the ARCH LM test for Pfizer’s return data indicates a Chi-squared statistic of 176.72 with 1 degree of freedom. The p-value is less than 2.2e-16, which is substantially below the 0.05 significance level, so we can reject the null hypothesis that there are no ARCH effects in the returns data. The extremely low p-value suggests that the presence of ARCH effects is highly statistically significant. This finding has important implications for financial modeling and risk management, indicating that the assumption of constant volatility is not suitable for this time series.

Code
ArchTest(returns_bntx, lags=1, demean=TRUE)

    ARCH LM-test; Null hypothesis: no ARCH effects

data:  returns_bntx
Chi-squared = 329.26, df = 1, p-value < 2.2e-16

The result of the ARCH LM test for BioNTech SE’s return data indicates a Chi-squared statistic of 329.26 with 1 degree of freedom. The p-value is less than 2.2e-16, which is substantially below the 0.05 significance level, so we can reject the null hypothesis that there are no ARCH effects in the returns data. The extremely low p-value suggests that the presence of ARCH effects is highly statistically significant. As with Pfizer, this indicates that the assumption of constant volatility is not suitable for this time series.

6. ARIMA Model Fitting

Having examined the ACF and the PACF plots, the next step in the analysis involves constructing an ARIMA model. This model will be tailored based on the insights gathered from the ACF/PACF plots, with the aim to capture the underlying patterns in the return series. Subsequently, I will apply a GARCH (Generalized Autoregressive Conditional Heteroskedasticity) model to the residuals of the ARIMA model. This two-step methodology leverages the ARIMA model to address the serial correlation in the data, and the GARCH model to capture the volatility clustering often observed in financial time series. This sequential approach will provide a comprehensive understanding of the dynamics at play in the returns of the stock.
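The two-step procedure can be sketched end-to-end on simulated data using only base R (`stats::arima` stands in here for `forecast::Arima`, and the AR/MA parameters are illustrative):

```r
# Step 1: fit an ARMA model for the mean of a simulated return-like series.
set.seed(3)
y   <- arima.sim(model = list(ar = 0.5, ma = -0.3), n = 1000, sd = 0.02)
fit <- arima(y, order = c(1, 0, 1))

# Step 2 operates on the residuals of the mean model: any remaining
# autocorrelation in res^2 is what the GARCH stage will capture.
res <- residuals(fit)
acf(res^2, plot = FALSE)$acf[2]   # lag-1 autocorrelation of squared residuals
```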

Using the Pfizer returns, I will search for the best-performing ARIMA model according to AIC, over p = 1 to 5, d = 0 or 1, and q = 1 to 5. Since the ACF/PACF plots suggest the returns are already mostly stationary, there should be no need to difference the data.

Code
ARIMA.c=function(p1,p2,q1,q2,data){
  i=1
  ls=matrix(rep(NA,6*50),nrow=50)  # holds p, d, q, AIC, BIC, AICc per model

  for (p in p1:p2)
  {
    for(q in q1:q2)
    {
      for(d in 0:1)
      {
        if(p+d+q<=6)  # cap total order to keep models parsimonious
        {
          model<- Arima(data,order=c(p,d,q))
          ls[i,]= c(p,d,q,model$aic,model$bic,model$aicc)
          i=i+1
        }
      }
    }
  }

  temp= as.data.frame(ls)
  names(temp)= c("p","d","q","AIC","BIC","AICc")

  temp
}

output <- ARIMA.c(1,5,1,5,data=returns_pfe) # fit candidate models on the log returns
output
    p  d  q       AIC       BIC      AICc
1   1  0  1 -14687.54 -14664.10 -14687.53
2   1  1  1 -14675.01 -14657.43 -14675.00
3   1  0  2 -14685.56 -14656.26 -14685.54
4   1  1  2 -14671.91 -14648.47 -14671.89
5   1  0  3 -14708.15 -14672.99 -14708.11
6   1  1  3 -14690.32 -14661.02 -14690.30
7   1  0  4 -14706.40 -14665.38 -14706.36
8   1  1  4 -14693.64 -14658.49 -14693.61
9   1  0  5 -14705.02 -14658.14 -14704.97
10  2  0  1 -14685.58 -14656.28 -14685.56
11  2  1  1 -14673.06 -14649.63 -14673.05
12  2  0  2 -14685.20 -14650.04 -14685.17
13  2  1  2 -14691.33 -14662.03 -14691.31
14  2  0  3 -14713.86 -14672.84 -14713.82
15  2  1  3 -14684.45 -14649.29 -14684.42
16  2  0  4 -14704.19 -14657.31 -14704.13
17  3  0  1 -14708.74 -14673.58 -14708.70
18  3  1  1 -14676.04 -14646.74 -14676.02
19  3  0  2 -14706.67 -14665.65 -14706.63
20  3  1  2 -14694.24 -14659.08 -14694.20
21  3  0  3 -14711.71 -14664.83 -14711.66
22  4  0  1 -14706.94 -14665.92 -14706.89
23  4  1  1 -14676.38 -14641.22 -14676.35
24  4  0  2 -14704.75 -14657.88 -14704.70
25  5  0  1 -14705.98 -14659.11 -14705.93
26 NA NA NA        NA        NA        NA
27 NA NA NA        NA        NA        NA
28 NA NA NA        NA        NA        NA
29 NA NA NA        NA        NA        NA
30 NA NA NA        NA        NA        NA
31 NA NA NA        NA        NA        NA
32 NA NA NA        NA        NA        NA
33 NA NA NA        NA        NA        NA
34 NA NA NA        NA        NA        NA
35 NA NA NA        NA        NA        NA
36 NA NA NA        NA        NA        NA
37 NA NA NA        NA        NA        NA
38 NA NA NA        NA        NA        NA
39 NA NA NA        NA        NA        NA
40 NA NA NA        NA        NA        NA
41 NA NA NA        NA        NA        NA
42 NA NA NA        NA        NA        NA
43 NA NA NA        NA        NA        NA
44 NA NA NA        NA        NA        NA
45 NA NA NA        NA        NA        NA
46 NA NA NA        NA        NA        NA
47 NA NA NA        NA        NA        NA
48 NA NA NA        NA        NA        NA
49 NA NA NA        NA        NA        NA
50 NA NA NA        NA        NA        NA
Code
output[which.min(output$AIC),] 
   p d q       AIC       BIC      AICc
14 2 0 3 -14713.86 -14672.84 -14713.82
Code
output[which.min(output$BIC),] 
   p d q       AIC       BIC     AICc
17 3 0 1 -14708.74 -14673.58 -14708.7
Code
output[which.min(output$AICc),] 
   p d q       AIC       BIC      AICc
14 2 0 3 -14713.86 -14672.84 -14713.82

The lowest AIC and AICc in the table belong to ARIMA(2,0,3), while the lowest BIC belongs to ARIMA(3,0,1). As a compromise between the criteria, I will proceed with ARIMA(3,0,2), whose AIC (-14706.67) is close to the minimum.

Code
fit_pfe <- Arima(returns_pfe,order=c(3,0,2))
summary(fit_pfe)
Series: returns_pfe 
ARIMA(3,0,2) with non-zero mean 

Coefficients:
          ar1     ar2     ar3     ma1      ma2   mean
      -0.8462  0.0045  0.0485  0.8253  -0.0325  1e-04
s.e.   1.2263  1.1206  0.0708  1.2291   1.1049  3e-04

sigma^2 = 2e-04:  log likelihood = 7360.33
AIC=-14706.67   AICc=-14706.63   BIC=-14665.65

Training set error measures:
                        ME       RMSE         MAE MPE MAPE      MASE
Training set -6.146671e-06 0.01412681 0.009873244 NaN  Inf 0.6517835
                    ACF1
Training set 0.000395289

Several of the coefficients in the model have p-values exceeding the conventional 0.05 threshold. There is therefore insufficient evidence to confidently assert that these coefficients have a meaningful impact on the dependent variable within this model.

Using the BioNTech SE returns, I will search for the best-performing ARIMA model according to AIC, over the same grid of p = 1 to 5, d = 0 or 1, and q = 1 to 5. Since the ACF/PACF plots suggest the returns are already mostly stationary, there should be no need to difference the data.

Code
output <- ARIMA.c(1,5,1,5,data=returns_bntx) # fit candidate models on the log returns
output
    p  d  q       AIC       BIC      AICc
1   1  0  1 -3461.159 -3441.011 -3461.124
2   1  1  1 -3452.872 -3437.764 -3452.851
3   1  0  2 -3465.838 -3440.652 -3465.785
4   1  1  2 -3451.908 -3431.763 -3451.873
5   1  0  3 -3464.173 -3433.951 -3464.099
6   1  1  3 -3450.009 -3424.828 -3449.956
7   1  0  4 -3462.216 -3426.957 -3462.117
8   1  1  4 -3450.912 -3420.695 -3450.838
9   1  0  5 -3460.325 -3420.028 -3460.197
10  2  0  1 -3466.013 -3440.828 -3465.960
11  2  1  1 -3454.853 -3434.708 -3454.818
12  2  0  2 -3464.110 -3433.888 -3464.036
13  2  1  2 -3457.779 -3432.599 -3457.726
14  2  0  3 -3462.171 -3426.911 -3462.071
15  2  1  3 -3454.809 -3424.592 -3454.735
16  2  0  4 -3460.198 -3419.902 -3460.071
17  3  0  1 -3464.130 -3433.908 -3464.056
18  3  1  1 -3456.081 -3430.900 -3456.028
19  3  0  2 -3464.136 -3428.877 -3464.037
20  3  1  2 -3455.954 -3425.737 -3455.879
21  3  0  3 -3460.326 -3420.030 -3460.199
22  4  0  1 -3462.156 -3426.897 -3462.057
23  4  1  1 -3454.535 -3424.318 -3454.460
24  4  0  2 -3460.169 -3419.873 -3460.042
25  5  0  1 -3460.543 -3420.247 -3460.416
26 NA NA NA        NA        NA        NA
27 NA NA NA        NA        NA        NA
28 NA NA NA        NA        NA        NA
29 NA NA NA        NA        NA        NA
30 NA NA NA        NA        NA        NA
31 NA NA NA        NA        NA        NA
32 NA NA NA        NA        NA        NA
33 NA NA NA        NA        NA        NA
34 NA NA NA        NA        NA        NA
35 NA NA NA        NA        NA        NA
36 NA NA NA        NA        NA        NA
37 NA NA NA        NA        NA        NA
38 NA NA NA        NA        NA        NA
39 NA NA NA        NA        NA        NA
40 NA NA NA        NA        NA        NA
41 NA NA NA        NA        NA        NA
42 NA NA NA        NA        NA        NA
43 NA NA NA        NA        NA        NA
44 NA NA NA        NA        NA        NA
45 NA NA NA        NA        NA        NA
46 NA NA NA        NA        NA        NA
47 NA NA NA        NA        NA        NA
48 NA NA NA        NA        NA        NA
49 NA NA NA        NA        NA        NA
50 NA NA NA        NA        NA        NA
Code
output[which.min(output$AIC),] 
   p d q       AIC       BIC     AICc
10 2 0 1 -3466.013 -3440.828 -3465.96
Code
output[which.min(output$BIC),] 
  p d q       AIC       BIC      AICc
1 1 0 1 -3461.159 -3441.011 -3461.124
Code
output[which.min(output$AICc),] 
   p d q       AIC       BIC     AICc
10 2 0 1 -3466.013 -3440.828 -3465.96

The best ARIMA model according to the lowest AIC (and AICc) is ARIMA(2,0,1).

Code
fit_bntx <- Arima(returns_bntx,order=c(2,0,1))
summary(fit_bntx)
Series: returns_bntx 
ARIMA(2,0,1) with non-zero mean 

Coefficients:
         ar1     ar2      ma1    mean
      0.6724  -0.072  -0.6596  0.0016
s.e.  0.1706   0.031   0.1695  0.0013

sigma^2 = 0.00277:  log likelihood = 1738.01
AIC=-3466.01   AICc=-3465.96   BIC=-3440.83

Training set error measures:
                       ME       RMSE        MAE MPE MAPE      MASE
Training set 3.641115e-06 0.05253933 0.03461972 NaN  Inf 0.6555673
                      ACF1
Training set -0.0004177628

The AR and MA coefficients are significant at the 0.05 level, but the mean term has a p-value well above the threshold, so there is insufficient evidence that it meaningfully contributes to the model.

7. Residuals Analysis

Code
sarima(returns_pfe,3,0,2)
initial  value -4.253848 
iter   2 value -4.254771
iter   3 value -4.255216
iter   4 value -4.255236
iter   5 value -4.255238
iter   6 value -4.255248
iter   7 value -4.255276
iter   8 value -4.255389
iter   9 value -4.256370
iter  10 value -4.256700
iter  11 value -4.257065
iter  12 value -4.257395
iter  13 value -4.257474
iter  14 value -4.257873
iter  15 value -4.258263
iter  16 value -4.258660
iter  17 value -4.258744
iter  18 value -4.258813
iter  19 value -4.258817
iter  20 value -4.258818
iter  21 value -4.258818
iter  22 value -4.258820
iter  23 value -4.258820
iter  24 value -4.258820
iter  25 value -4.258821
iter  26 value -4.258822
iter  27 value -4.258824
iter  28 value -4.258826
iter  29 value -4.258828
iter  30 value -4.258833
iter  31 value -4.258841
iter  32 value -4.258860
iter  33 value -4.258893
iter  34 value -4.258946
iter  35 value -4.258985
iter  36 value -4.258985
iter  37 value -4.258987
iter  38 value -4.258988
iter  39 value -4.258989
iter  40 value -4.258989
iter  41 value -4.258989
iter  42 value -4.258990
iter  43 value -4.258990
iter  43 value -4.258990
iter  43 value -4.258990
final  value -4.258990 
converged
initial  value -4.259670 
iter   1 value -4.259670
final  value -4.259670 
converged
<><><><><><><><><><><><><><>
 
Coefficients: 
      Estimate     SE t.value p.value
ar1    -0.8462 1.2263 -0.6900  0.4902
ar2     0.0045 1.1206  0.0040  0.9968
ar3     0.0485 0.0708  0.6850  0.4934
ma1     0.8253 1.2291  0.6714  0.5020
ma2    -0.0325 1.1049 -0.0294  0.9765
xmean   0.0001 0.0003  0.4098  0.6820

sigma^2 estimated as 0.0001995668 on 2585 degrees of freedom 
 
AIC = -5.676059  AICc = -5.676046  BIC = -5.660228 
 

Examining the Standardized Residuals plot from the model, we observe notable fluctuations remaining. Evidently, peaks of volatility emerge, notably around the years 2015, 2018, and 2020. To validate this observation, we’ll further investigate by plotting the ACF and PACF of the residuals.

Code
pfe.res<-fit_pfe$residuals

plot1<-ggAcf(pfe.res, 40) + theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 
plot2<- ggPacf(pfe.res, 40)+theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 

grid.arrange(plot1, plot2,nrow=2)

Code
plot1<-ggAcf(pfe.res^2, 40) + theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 
plot2<- ggPacf(pfe.res^2, 40)+theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 

grid.arrange(plot1, plot2,nrow=2)

The squared residuals show strong autocorrelation at lags 1, 2, 3, 4, and beyond, indicating that volatility structure remains in the residuals. Fitting a GARCH model to the residuals is therefore warranted.

Code
sarima(returns_bntx,2,0,1)
initial  value -2.943072 
iter   2 value -2.944640
iter   3 value -2.944721
iter   4 value -2.944776
iter   5 value -2.944779
iter   6 value -2.944883
iter   7 value -2.944986
iter   8 value -2.945092
iter   9 value -2.945187
iter  10 value -2.945318
iter  11 value -2.945366
iter  12 value -2.945454
iter  13 value -2.945522
iter  14 value -2.945547
iter  15 value -2.945548
iter  16 value -2.945549
iter  17 value -2.945552
iter  18 value -2.945559
iter  19 value -2.945571
iter  20 value -2.945578
iter  21 value -2.945579
iter  22 value -2.945579
iter  23 value -2.945579
iter  24 value -2.945580
iter  25 value -2.945581
iter  26 value -2.945582
iter  27 value -2.945583
iter  28 value -2.945583
iter  29 value -2.945583
iter  30 value -2.945583
iter  31 value -2.945584
iter  32 value -2.945584
iter  32 value -2.945584
iter  32 value -2.945584
final  value -2.945584 
converged
initial  value -2.945680 
iter   2 value -2.945685
iter   3 value -2.945804
iter   4 value -2.945857
iter   5 value -2.945886
iter   6 value -2.945930
iter   7 value -2.946035
iter   8 value -2.946087
iter   9 value -2.946152
iter  10 value -2.946175
iter  11 value -2.946176
iter  12 value -2.946176
iter  13 value -2.946177
iter  14 value -2.946181
iter  15 value -2.946183
iter  16 value -2.946184
iter  17 value -2.946185
iter  18 value -2.946185
iter  19 value -2.946185
iter  19 value -2.946185
final  value -2.946185 
converged
<><><><><><><><><><><><><><>
 
Coefficients: 
      Estimate     SE t.value p.value
ar1     0.6724 0.1706  3.9403  0.0001
ar2    -0.0720 0.0310 -2.3232  0.0203
ma1    -0.6596 0.1695 -3.8926  0.0001
xmean   0.0016 0.0013  1.2047  0.2286

sigma^2 estimated as 0.002760381 on 1134 degrees of freedom 
 
AIC = -3.045706  AICc = -3.045675  BIC = -3.023575 
 

Examining the Standardized Residuals plot from the model, we observe notable fluctuations remaining. Evidently, peaks of volatility emerge, notably around the year 2020. To validate this observation, we’ll further investigate by plotting the ACF and PACF of the residuals.

Code
bntx.res<-fit_bntx$residuals

plot1<-ggAcf(bntx.res, 40) + theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 
plot2<- ggPacf(bntx.res, 40)+theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 

grid.arrange(plot1, plot2,nrow=2)

Code
plot1<-ggAcf(bntx.res^2, 40) + theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 
plot2<- ggPacf(bntx.res^2, 40)+theme_bw()+
      theme(plot.background = element_rect(fill = "#D9E3F1", color = NA),
            panel.background = element_rect(fill = "#D9E3F1", color = NA)) 

grid.arrange(plot1, plot2,nrow=2)

The squared residuals show strong autocorrelation at lags 1 through 4, indicating that volatility structure remains in the residuals. Fitting a GARCH model to the residuals is therefore warranted.

8. Method 1: GARCH Model on ARIMA Residuals

The preceding section revealed that conventional ARIMA models failed to adequately capture the correlation within volatility clusters for both Pfizer and BioNTech SE returns. Despite employing ARIMA modeling techniques, the examination of standardized residuals and autocorrelation (ACF/PACF) plots unveiled substantial remaining correlation. Consequently, I intend to apply a GARCH model to the residuals and identify the optimal model based on the AIC.
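For intuition about what the grid search below is optimizing, here is a minimal base-R sketch of GARCH(1,1) estimation by maximum likelihood. This is illustrative only; the analysis itself relies on `tseries::garch` and `fGarch::garchFit`, and the `res` series in the usage comment is a placeholder for a residual series such as the ARIMA residuals.

```r
# Negative Gaussian log-likelihood of a GARCH(1,1) model (illustrative sketch).
garch11_nll <- function(par, x) {
  omega <- par[1]; alpha <- par[2]; beta <- par[3]
  # crude bounds: keep the variance recursion positive and stationary
  if (omega <= 0 || alpha < 0 || beta < 0 || alpha + beta >= 1) return(1e10)
  n  <- length(x)
  s2 <- numeric(n)
  s2[1] <- var(x)                        # initialize at the sample variance
  for (t in 2:n) s2[t] <- omega + alpha * x[t - 1]^2 + beta * s2[t - 1]
  0.5 * sum(log(2 * pi) + log(s2) + x^2 / s2)
}

# Hypothetical usage on a residual series `res`:
# fit <- optim(c(1e-5, 0.1, 0.8), garch11_nll, x = res)
# fit$par                              # omega, alpha, beta estimates
# 2 * fit$value + 2 * 3                # AIC, the criterion used in the grid search
```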

Code
model <- list() ## set counter
cc <- 1
for (p in 1:7) {
  for (q in 1:7) {
  
model[[cc]] <- garch(pfe.res,order=c(q,p),trace=F)
cc <- cc + 1
}
} 

## get AIC values for model evaluation
GARCH_AIC <- sapply(model, AIC) ## model with lowest AIC is the best
which(GARCH_AIC == min(GARCH_AIC))
[1] 2

The best model according to AIC is:

Code
model[[which(GARCH_AIC == min(GARCH_AIC))]]

Call:
garch(x = pfe.res, order = c(q, p), trace = F)

Coefficient(s):
       a0         a1         b1         b2  
6.877e-06  1.476e-01  5.693e-01  2.568e-01  

The model minimizing AIC in the grid has one ARCH term and two GARCH terms (coefficients a1, b1, and b2 above). In the subsequent code, I fit the closely related, more parsimonious GARCH(1,1) and examine its diagnostics. Should the evaluation metrics, including coefficient significance, fall short of expectations, I will consider training an alternative GARCH model.

Code
summary(garchFit(~garch(1,1), pfe.res,trace = F))

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = pfe.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x11ec71190>
 [data = pfe.res]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
6.1467e-05  6.2071e-06  1.2562e-01  8.5028e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     6.147e-05   2.211e-04    0.278    0.781    
omega  6.207e-06   1.389e-06    4.469 7.86e-06 ***
alpha1 1.256e-01   1.652e-02    7.602 2.91e-14 ***
beta1  8.503e-01   1.884e-02   45.141  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 7634.289    normalized:  2.946464 

Description:
 Mon Apr 22 20:35:34 2024 by user:  


Standardised Residuals Tests:
                                  Statistic    p-Value
 Jarque-Bera Test   R    Chi^2  1197.863107 0.00000000
 Shapiro-Wilk Test  R    W         0.968343 0.00000000
 Ljung-Box Test     R    Q(10)    13.956483 0.17498582
 Ljung-Box Test     R    Q(15)    15.348289 0.42663242
 Ljung-Box Test     R    Q(20)    21.055158 0.39389243
 Ljung-Box Test     R^2  Q(10)    19.889346 0.03031683
 Ljung-Box Test     R^2  Q(15)    26.134182 0.03663573
 Ljung-Box Test     R^2  Q(20)    26.902966 0.13801612
 LM Arch Test       R    TR^2     22.470682 0.03257096

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-5.889841 -5.880794 -5.889845 -5.886562 

All of the coefficients are significant for GARCH(1,1). Thus the best model according to method 1 for the Pfizer returns is ARIMA(3,0,2) + GARCH(1,1).

Code
model <- list() ## set counter
cc <- 1
for (p in 1:7) {
  for (q in 1:7) {
  
model[[cc]] <- garch(bntx.res,order=c(q,p),trace=F)
cc <- cc + 1
}
} 

## get AIC values for model evaluation
GARCH_AIC <- sapply(model, AIC) ## model with lowest AIC is the best
which(GARCH_AIC == min(GARCH_AIC))
[1] 1

The best model according to AIC is:

Code
model[[which(GARCH_AIC == min(GARCH_AIC))]]

Call:
garch(x = bntx.res, order = c(q, p), trace = F)

Coefficient(s):
       a0         a1         b1  
1.984e-05  1.282e-01  8.729e-01  

Here the grid search selects the first model in the list, GARCH(1,1), as shown by the coefficients a0, a1, and b1 above. Before settling on so small a model, I will first examine the diagnostics of a high-order GARCH(6,7) specification to check whether additional terms are warranted. Should its evaluation metrics, including coefficient significance, fall short of expectations, I will train an alternative GARCH model.

Code
summary(garchFit(~garch(6,7), bntx.res,trace = F))

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(6, 7), data = bntx.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(6, 7)
<environment: 0x12f522980>
 [data = bntx.res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1       alpha2       alpha3       alpha4  
-3.6411e-05   6.7567e-05   1.9846e-01   9.0635e-02   9.5246e-02   1.0000e-08  
     alpha5       alpha6        beta1        beta2        beta3        beta4  
 3.8403e-02   1.0000e-08   1.0000e-08   1.0000e-08   1.0000e-08   1.5299e-02  
      beta5        beta6        beta7  
 1.0000e-08   1.7808e-01   3.6609e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -3.641e-05   9.647e-04   -0.038  0.96989    
omega   6.757e-05   2.335e-05    2.893  0.00381 ** 
alpha1  1.985e-01   3.034e-02    6.542 6.09e-11 ***
alpha2  9.064e-02         NaN      NaN      NaN    
alpha3  9.525e-02         NaN      NaN      NaN    
alpha4  1.000e-08   3.185e-02    0.000  1.00000    
alpha5  3.840e-02         NaN      NaN      NaN    
alpha6  1.000e-08         NaN      NaN      NaN    
beta1   1.000e-08         NaN      NaN      NaN    
beta2   1.000e-08   1.059e-01    0.000  1.00000    
beta3   1.000e-08         NaN      NaN      NaN    
beta4   1.530e-02   1.406e-01    0.109  0.91335    
beta5   1.000e-08         NaN      NaN      NaN    
beta6   1.781e-01   6.407e-02    2.780  0.00544 ** 
beta7   3.661e-01   8.645e-02    4.235 2.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2012.333    normalized:  1.768307 

Description:
 Mon Apr 22 20:35:34 2024 by user:  


Standardised Residuals Tests:
                                  Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  143.7057801 0.000000e+00
 Shapiro-Wilk Test  R    W        0.9819121 1.054792e-10
 Ljung-Box Test     R    Q(10)   13.2870857 2.080605e-01
 Ljung-Box Test     R    Q(15)   15.7423290 3.993803e-01
 Ljung-Box Test     R    Q(20)   17.4310950 6.248295e-01
 Ljung-Box Test     R^2  Q(10)    6.8236480 7.419822e-01
 Ljung-Box Test     R^2  Q(15)   12.7664168 6.203343e-01
 Ljung-Box Test     R^2  Q(20)   21.5969488 3.627755e-01
 LM Arch Test       R    TR^2     8.1994075 7.693596e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.510251 -3.443858 -3.510593 -3.485176 

We can see from the table above that several coefficients are not significant (many estimates are pinned at the 1.0e-08 boundary with NaN standard errors). I will try a simpler model, GARCH(1,2), next.

Code
summary(garchFit(~garch(1,2), bntx.res,trace = F))

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 2), data = bntx.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 2)
<environment: 0x12f7caa98>
 [data = bntx.res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1        beta1        beta2  
-3.6411e-05   2.7740e-05   1.7727e-01   1.7158e-01   6.4789e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -3.641e-05   9.982e-04   -0.036   0.9709    
omega   2.774e-05   1.056e-05    2.628   0.0086 ** 
alpha1  1.773e-01   2.826e-02    6.273 3.54e-10 ***
beta1   1.716e-01   7.266e-02    2.361   0.0182 *  
beta2   6.479e-01   7.470e-02    8.673  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2002.911    normalized:  1.760027 

Description:
 Mon Apr 22 20:35:34 2024 by user:  


Standardised Residuals Tests:
                                  Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  146.5900391 0.000000e+00
 Shapiro-Wilk Test  R    W        0.9815654 7.747998e-11
 Ljung-Box Test     R    Q(10)   14.6070322 1.470590e-01
 Ljung-Box Test     R    Q(15)   17.3692381 2.972746e-01
 Ljung-Box Test     R    Q(20)   18.7540812 5.378588e-01
 Ljung-Box Test     R^2  Q(10)   12.3966621 2.593860e-01
 Ljung-Box Test     R^2  Q(15)   21.1449486 1.322617e-01
 Ljung-Box Test     R^2  Q(20)   29.3673442 8.077415e-02
 LM Arch Test       R    TR^2    14.3682200 2.778150e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.511267 -3.489136 -3.511306 -3.502909 

All of the variance-equation coefficients (omega, alpha1, beta1, beta2) are significant for GARCH(1,2); only the mean term mu is not. Thus the best model according to Method 1 for the BioNTech SE returns is ARIMA(2,0,1) + GARCH(1,2).

9. Method 2: GARCH Model

Although the GARCH coefficients in Method 1 were significant, the underlying ARIMA mean models lacked significant coefficients. Hence I also explore Method 2, which fits the GARCH model directly on the returns.
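
For reference, the specification being fitted — written here for the GARCH(1,1) case, matching garchFit's mu, omega, alpha1, and beta1 — is:

```latex
\begin{aligned}
r_t &= \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \qquad z_t \sim \mathcal{N}(0,1),\\
\sigma_t^2 &= \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 .
\end{aligned}
```

Today's conditional variance responds to yesterday's squared shock (the ARCH term, alpha1) and to its own lag (the GARCH term, beta1); higher-order fits simply add more lags of each kind.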

Code
model <- list()
cc <- 1  ## counter over the 7 x 7 (p, q) grid
for (p in 1:7) {
  for (q in 1:7) {
    model[[cc]] <- garch(pfe.res, order = c(q, p), trace = F)
    cc <- cc + 1
  }
}

## get AIC values for model evaluation
GARCH_AIC <- sapply(model, AIC) ## model with lowest AIC is the best
which(GARCH_AIC == min(GARCH_AIC))
[1] 2

The best model according to AIC is:

Code
model[[which(GARCH_AIC == min(GARCH_AIC))]]

Call:
garch(x = pfe.res, order = c(q, p), trace = F)

Coefficient(s):
       a0         a1         b1         b2  
6.877e-06  1.476e-01  5.693e-01  2.568e-01  

The grid search returns index 2, which maps to order c(2, 1) — one ARCH term and two GARCH terms, as the printed coefficients (a0, a1, b1, b2) confirm. In the next code block I examine the diagnostics of the still more parsimonious GARCH(1,1); should its evaluation metrics, including coefficient significance, fall short, I will revisit the richer specification.

Code
summary(garchFit(~garch(1,1), pfe.res,trace = F))

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = pfe.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x13c95a358>
 [data = pfe.res]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
6.1467e-05  6.2071e-06  1.2562e-01  8.5028e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     6.147e-05   2.211e-04    0.278    0.781    
omega  6.207e-06   1.389e-06    4.469 7.86e-06 ***
alpha1 1.256e-01   1.652e-02    7.602 2.91e-14 ***
beta1  8.503e-01   1.884e-02   45.141  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 7634.289    normalized:  2.946464 

Description:
 Mon Apr 22 20:35:35 2024 by user:  


Standardised Residuals Tests:
                                  Statistic    p-Value
 Jarque-Bera Test   R    Chi^2  1197.863107 0.00000000
 Shapiro-Wilk Test  R    W         0.968343 0.00000000
 Ljung-Box Test     R    Q(10)    13.956483 0.17498582
 Ljung-Box Test     R    Q(15)    15.348289 0.42663242
 Ljung-Box Test     R    Q(20)    21.055158 0.39389243
 Ljung-Box Test     R^2  Q(10)    19.889346 0.03031683
 Ljung-Box Test     R^2  Q(15)    26.134182 0.03663573
 Ljung-Box Test     R^2  Q(20)    26.902966 0.13801612
 LM Arch Test       R    TR^2     22.470682 0.03257096

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-5.889841 -5.880794 -5.889845 -5.886562 

All of the variance-equation coefficients (omega, alpha1, beta1) are significant for GARCH(1,1); only the mean term mu is not. Thus the best model according to Method 2 for the Pfizer returns is GARCH(1,1).
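
Written out with the estimated coefficients from the table above, the selected Pfizer variance equation is:

```latex
\hat{\sigma}_t^2 \;=\; 6.21 \times 10^{-6} \;+\; 0.1256\,\varepsilon_{t-1}^2 \;+\; 0.8503\,\hat{\sigma}_{t-1}^2 ,
```

with persistence alpha1 + beta1 of roughly 0.976, i.e. volatility shocks decay slowly but the variance process remains stationary.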

Code
model <- list()
cc <- 1  ## counter over the 7 x 7 (p, q) grid
for (p in 1:7) {
  for (q in 1:7) {
    model[[cc]] <- garch(bntx.res, order = c(q, p), trace = F)
    cc <- cc + 1
  }
}

## get AIC values for model evaluation
GARCH_AIC <- sapply(model, AIC) ## model with lowest AIC is the best
which(GARCH_AIC == min(GARCH_AIC))
[1] 1

The best model according to AIC is:

Code
model[[which(GARCH_AIC == min(GARCH_AIC))]]

Call:
garch(x = bntx.res, order = c(q, p), trace = F)

Coefficient(s):
       a0         a1         b1  
1.984e-05  1.282e-01  8.729e-01  

As in Method 1, the grid search returns index 1, which maps to order c(1, 1) — a GARCH(1,1), as the three printed coefficients (a0, a1, b1) confirm. Before settling on it, I will first examine the diagnostics of a richer GARCH(6,7) fit; should its evaluation metrics, including coefficient significance, fall short, I will fall back to a simpler GARCH model.

Code
summary(garchFit(~garch(6,7), bntx.res,trace = F))

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(6, 7), data = bntx.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(6, 7)
<environment: 0x12f682a28>
 [data = bntx.res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1       alpha2       alpha3       alpha4  
-3.6411e-05   6.7567e-05   1.9846e-01   9.0635e-02   9.5246e-02   1.0000e-08  
     alpha5       alpha6        beta1        beta2        beta3        beta4  
 3.8403e-02   1.0000e-08   1.0000e-08   1.0000e-08   1.0000e-08   1.5299e-02  
      beta5        beta6        beta7  
 1.0000e-08   1.7808e-01   3.6609e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -3.641e-05   9.647e-04   -0.038  0.96989    
omega   6.757e-05   2.335e-05    2.893  0.00381 ** 
alpha1  1.985e-01   3.034e-02    6.542 6.09e-11 ***
alpha2  9.064e-02         NaN      NaN      NaN    
alpha3  9.525e-02         NaN      NaN      NaN    
alpha4  1.000e-08   3.185e-02    0.000  1.00000    
alpha5  3.840e-02         NaN      NaN      NaN    
alpha6  1.000e-08         NaN      NaN      NaN    
beta1   1.000e-08         NaN      NaN      NaN    
beta2   1.000e-08   1.059e-01    0.000  1.00000    
beta3   1.000e-08         NaN      NaN      NaN    
beta4   1.530e-02   1.406e-01    0.109  0.91335    
beta5   1.000e-08         NaN      NaN      NaN    
beta6   1.781e-01   6.407e-02    2.780  0.00544 ** 
beta7   3.661e-01   8.645e-02    4.235 2.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2012.333    normalized:  1.768307 

Description:
 Mon Apr 22 20:35:35 2024 by user:  


Standardised Residuals Tests:
                                  Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  143.7057801 0.000000e+00
 Shapiro-Wilk Test  R    W        0.9819121 1.054792e-10
 Ljung-Box Test     R    Q(10)   13.2870857 2.080605e-01
 Ljung-Box Test     R    Q(15)   15.7423290 3.993803e-01
 Ljung-Box Test     R    Q(20)   17.4310950 6.248295e-01
 Ljung-Box Test     R^2  Q(10)    6.8236480 7.419822e-01
 Ljung-Box Test     R^2  Q(15)   12.7664168 6.203343e-01
 Ljung-Box Test     R^2  Q(20)   21.5969488 3.627755e-01
 LM Arch Test       R    TR^2     8.1994075 7.693596e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.510251 -3.443858 -3.510593 -3.485176 

We can see from the table above that several coefficients are not significant (many estimates are pinned at the 1.0e-08 boundary with NaN standard errors). I will try a simpler model, GARCH(1,1), next.

Code
summary(garchFit(~garch(1,1), bntx.res,trace = F))

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = bntx.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x13bbb7a70>
 [data = bntx.res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1        beta1  
-3.6411e-05   1.9739e-05   1.2842e-01   8.7284e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -3.641e-05   1.011e-03   -0.036  0.97127    
omega   1.974e-05   7.518e-06    2.625  0.00865 ** 
alpha1  1.284e-01   2.114e-02    6.074 1.24e-09 ***
beta1   8.728e-01   1.817e-02   48.049  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 1995.254    normalized:  1.753298 

Description:
 Mon Apr 22 20:35:35 2024 by user:  


Standardised Residuals Tests:
                                  Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  211.8127700 0.000000e+00
 Shapiro-Wilk Test  R    W        0.9779615 3.825942e-12
 Ljung-Box Test     R    Q(10)   11.7767310 3.002770e-01
 Ljung-Box Test     R    Q(15)   14.2341257 5.078518e-01
 Ljung-Box Test     R    Q(20)   15.6964910 7.352659e-01
 Ljung-Box Test     R^2  Q(10)   17.3663044 6.664147e-02
 Ljung-Box Test     R^2  Q(15)   24.8767411 5.162346e-02
 Ljung-Box Test     R^2  Q(20)   32.8169702 3.533394e-02
 LM Arch Test       R    TR^2    17.6186971 1.277682e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.499567 -3.481862 -3.499592 -3.492880 

All of the variance-equation coefficients (omega, alpha1, beta1) are significant for GARCH(1,1); only the mean term mu is not. Thus the best model according to Method 2 for the BioNTech SE returns is GARCH(1,1).
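
One further check worth noting (an addition, not part of the original diagnostics): the persistence alpha1 + beta1 of each fitted GARCH(1,1), computed from the point estimates reported in the summaries above.

```python
# Volatility persistence alpha1 + beta1 for the fitted GARCH(1,1) models.
# Coefficient values are the point estimates reported in the summaries above.
pfe_persistence  = 0.12562 + 0.85028   # Pfizer residuals fit
bntx_persistence = 0.12842 + 0.87284   # BioNTech residuals fit

print(round(pfe_persistence, 5))    # ~0.976: stationary, slowly mean-reverting variance
print(round(bntx_persistence, 5))   # ~1.001: point estimates sit just above 1
```

A sum at or above 1 would, taken at face value, imply an integrated (non-mean-reverting) variance process; here the excess over 1 for BNTX is far smaller than the reported standard errors, so it should be read as "very high persistence" rather than firm evidence of non-stationarity.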

10. Model Diagnostics

Method 1: ARIMA(3,0,2) + GARCH(1,1)

Code
summary(fit_pfe)
Series: returns_pfe 
ARIMA(3,0,2) with non-zero mean 

Coefficients:
          ar1     ar2     ar3     ma1      ma2   mean
      -0.8462  0.0045  0.0485  0.8253  -0.0325  1e-04
s.e.   1.2263  1.1206  0.0708  1.2291   1.1049  3e-04

sigma^2 = 2e-04:  log likelihood = 7360.33
AIC=-14706.67   AICc=-14706.63   BIC=-14665.65

Training set error measures:
                        ME       RMSE         MAE MPE MAPE      MASE
Training set -6.146671e-06 0.01412681 0.009873244 NaN  Inf 0.6517835
                    ACF1
Training set 0.000395289
Code
garch_pfe <- garchFit(~garch(1,1), pfe.res,trace = F)
summary(garch_pfe)

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = pfe.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x12cccbc50>
 [data = pfe.res]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
6.1467e-05  6.2071e-06  1.2562e-01  8.5028e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     6.147e-05   2.211e-04    0.278    0.781    
omega  6.207e-06   1.389e-06    4.469 7.86e-06 ***
alpha1 1.256e-01   1.652e-02    7.602 2.91e-14 ***
beta1  8.503e-01   1.884e-02   45.141  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 7634.289    normalized:  2.946464 

Description:
 Mon Apr 22 20:35:35 2024 by user:  


Standardised Residuals Tests:
                                  Statistic    p-Value
 Jarque-Bera Test   R    Chi^2  1197.863107 0.00000000
 Shapiro-Wilk Test  R    W         0.968343 0.00000000
 Ljung-Box Test     R    Q(10)    13.956483 0.17498582
 Ljung-Box Test     R    Q(15)    15.348289 0.42663242
 Ljung-Box Test     R    Q(20)    21.055158 0.39389243
 Ljung-Box Test     R^2  Q(10)    19.889346 0.03031683
 Ljung-Box Test     R^2  Q(15)    26.134182 0.03663573
 Ljung-Box Test     R^2  Q(20)    26.902966 0.13801612
 LM Arch Test       R    TR^2     22.470682 0.03257096

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-5.889841 -5.880794 -5.889845 -5.886562 
Code
summary(garch(pfe.res, order = c(1,1),trace = F))

Call:
garch(x = pfe.res, order = c(1, 1), trace = F)

Model:
GARCH(1,1)

Residuals:
      Min        1Q    Median        3Q       Max 
-6.368810 -0.560576 -0.008048  0.540952  5.540625 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 1.146e-05   1.023e-06    11.20   <2e-16 ***
a1 2.455e-01   1.699e-02    14.45   <2e-16 ***
b1 7.400e-01   1.180e-02    62.70   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 1237.2, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 1.1271, df = 1, p-value = 0.2884
Code
checkresiduals(garch(pfe.res, order = c(1,1),trace = F))


    Ljung-Box test

data:  Residuals
Q* = 524.38, df = 518, p-value = 0.4137

Model df: 0.   Total lags used: 518

Method 2: GARCH(1,1)

Code
garch_pfe2 <- garch(returns_pfe, order = c(1,1), trace = F)
summary(garch_pfe2)

Call:
garch(x = returns_pfe, order = c(1, 1), trace = F)

Model:
GARCH(1,1)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.9067 -0.5504  0.0000  0.5592  6.1103 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 6.458e-06   7.236e-07    8.925   <2e-16 ***
a1 1.317e-01   8.732e-03   15.081   <2e-16 ***
b1 8.436e-01   9.344e-03   90.282   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 1165.5, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 0.3956, df = 1, p-value = 0.5294
Code
checkresiduals(garch_pfe2)


    Ljung-Box test

data:  Residuals
Q* = 485.28, df = 518, p-value = 0.8457

Model df: 0.   Total lags used: 518

Based on the model diagnostics of these two models, their performance is comparable and satisfactory. Both have well-behaved residual plots, with only a few significant lags in the ACF, and their AICs are low and similar. The Ljung-Box test p-values are above 0.05 for both Model 1 (0.41) and Model 2 (0.85), so we do not reject the null of no remaining autocorrelation: the serial dependence appears to be adequately captured by both. Since the ARIMA coefficients were not all significant while the GARCH coefficients were, we conduct cross-validation in the next section to decide between the two.

Method 1: ARIMA(2,0,1) + GARCH(1,1)

Code
summary(fit_bntx)
Series: returns_bntx 
ARIMA(2,0,1) with non-zero mean 

Coefficients:
         ar1     ar2      ma1    mean
      0.6724  -0.072  -0.6596  0.0016
s.e.  0.1706   0.031   0.1695  0.0013

sigma^2 = 0.00277:  log likelihood = 1738.01
AIC=-3466.01   AICc=-3465.96   BIC=-3440.83

Training set error measures:
                       ME       RMSE        MAE MPE MAPE      MASE
Training set 3.641115e-06 0.05253933 0.03461972 NaN  Inf 0.6555673
                      ACF1
Training set -0.0004177628
Code
garch_bntx <- garchFit(~garch(1,1), bntx.res,trace = F)
summary(garch_bntx)

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = bntx.res, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x12e3e1a18>
 [data = bntx.res]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1        beta1  
-3.6411e-05   1.9739e-05   1.2842e-01   8.7284e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -3.641e-05   1.011e-03   -0.036  0.97127    
omega   1.974e-05   7.518e-06    2.625  0.00865 ** 
alpha1  1.284e-01   2.114e-02    6.074 1.24e-09 ***
beta1   8.728e-01   1.817e-02   48.049  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 1995.254    normalized:  1.753298 

Description:
 Mon Apr 22 20:35:36 2024 by user:  


Standardised Residuals Tests:
                                  Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  211.8127700 0.000000e+00
 Shapiro-Wilk Test  R    W        0.9779615 3.825942e-12
 Ljung-Box Test     R    Q(10)   11.7767310 3.002770e-01
 Ljung-Box Test     R    Q(15)   14.2341257 5.078518e-01
 Ljung-Box Test     R    Q(20)   15.6964910 7.352659e-01
 Ljung-Box Test     R^2  Q(10)   17.3663044 6.664147e-02
 Ljung-Box Test     R^2  Q(15)   24.8767411 5.162346e-02
 Ljung-Box Test     R^2  Q(20)   32.8169702 3.533394e-02
 LM Arch Test       R    TR^2    17.6186971 1.277682e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.499567 -3.481862 -3.499592 -3.492880 
Code
summary(garch(bntx.res, order = c(1,1),trace = F))

Call:
garch(x = bntx.res, order = c(1, 1), trace = F)

Model:
GARCH(1,1)

Residuals:
     Min       1Q   Median       3Q      Max 
-5.26030 -0.60758 -0.05379  0.50461  4.55463 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 1.984e-05   4.319e-06    4.593 4.38e-06 ***
a1 1.282e-01   9.434e-03   13.594  < 2e-16 ***
b1 8.729e-01   7.647e-03  114.136  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 204.37, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 8.3991, df = 1, p-value = 0.003754
Code
checkresiduals(garch(bntx.res, order = c(1,1),trace = F))


    Ljung-Box test

data:  Residuals
Q* = 261.51, df = 228, p-value = 0.06311

Model df: 0.   Total lags used: 228

Method 2: GARCH(1,1)

Code
garch_bntx2 <- garch(returns_bntx, order = c(1,1), trace = F)
summary(garch_bntx2)

Call:
garch(x = returns_bntx, order = c(1, 1), trace = F)

Model:
GARCH(1,1)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.2537 -0.5702  0.0000  0.5491  6.6602 

Coefficient(s):
    Estimate  Std. Error  t value Pr(>|t|)    
a0 1.984e-05   4.340e-06     4.57 4.87e-06 ***
a1 1.352e-01   9.643e-03    14.02  < 2e-16 ***
b1 8.674e-01   7.484e-03   115.90  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostic Tests:
    Jarque Bera Test

data:  Residuals
X-squared = 546.44, df = 2, p-value < 2.2e-16


    Box-Ljung test

data:  Squared.Residuals
X-squared = 6.0678, df = 1, p-value = 0.01377
Code
checkresiduals(garch_bntx2)


    Ljung-Box test

data:  Residuals
Q* = 223.42, df = 228, p-value = 0.5733

Model df: 0.   Total lags used: 228

Based on the model diagnostics of these two models, their performance is again comparable. Both have reasonable residual plots, with only a few significant lags in the ACF, and their AICs are low and similar. The Ljung-Box test p-values on the residuals are above 0.05 for both Model 1 (0.063, borderline) and Model 2 (0.57), so there is no significant remaining autocorrelation in the mean. The Box-Ljung tests on the squared residuals, however, are significant (p = 0.004 and p = 0.014), suggesting some volatility structure is not fully captured by either fit. Since the ARIMA coefficients were not all significant while the GARCH coefficients were, we rely on cross-validation in the next section to decide between the two.

11. Cross-Validation
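
The evaluation below is rolling one-step-ahead cross-validation: for each of 100 forecast origins, both models are refit on a sliding window of k observations and their one-step mean forecasts are scored against the next observed value. Written for the idealized case where both models forecast the return itself, the comparison metric is

```latex
\mathrm{RMSE}_m \;=\; \sqrt{\frac{1}{100} \sum_{i=1}^{100} \left( \hat{r}^{(m)}_{k+i} - r_{k+i} \right)^2 }, \qquad m \in \{1, 2\}.
```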

Code
k <- 700 # minimum training-window length (about 27% of the 2591 PFE returns)
n <- length(returns_pfe)
n-k 
[1] 1891
Code
err1 <- c()
err2 <- c()

for (i in 1:100) {
  xtrain <- returns_pfe[1:(k - 1) + i]   ## sliding training window of length k
  xtest  <- returns_pfe[k + i]           ## next observation to forecast

  ## Model 1: ARIMA(3,0,2) + GARCH(1,1) on the ARIMA residuals
  arima.fit <- Arima(xtrain, order = c(3, 0, 2), include.drift = TRUE)
  arima.res <- residuals(arima.fit)
  fit1 <- garchFit(~garch(1, 1), arima.res, trace = F)
  fcast1 <- predict(fit1, n.ahead = 1)

  ## Model 2: GARCH(1,1) fitted on the differenced training window
  returns <- diff(xtrain)
  fit2 <- garchFit(~garch(1, 1), data = returns, trace = F)
  fcast2 <- predict(fit2, n.ahead = 1)

  err1 <- c(err1, (fcast1$meanForecast - xtest)^2)
  err2 <- c(err2, (fcast2$meanForecast - xtest)^2)
}

RMSE1 <- sqrt(mean(err1))
RMSE2 <- sqrt(mean(err2))

RMSE1
[1] 0.01224339
Code
RMSE2
[1] 0.01223838

Best Model: The two models identified by the methods were Model 1: ARIMA(3,0,2) + GARCH(1,1) and Model 2: GARCH(1,1). After cross-validation, Model 2 has the lower RMSE, though only marginally (0.012238 vs 0.012243). Therefore, the best model for the Pfizer returns is Model 2: GARCH(1,1).
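
To put the margin in perspective, the relative gap between the two RMSEs (values copied from the output above) can be computed in a quick standalone check (Python for illustration):

```python
# RMSE values reported above for the Pfizer cross-validation.
rmse1 = 0.01224339   # Model 1: ARIMA(3,0,2) + GARCH(1,1)
rmse2 = 0.01223838   # Model 2: GARCH(1,1)

rel_gap = (rmse1 - rmse2) / rmse2
print(f"{rel_gap:.4%}")   # roughly 0.04%
```

Model 2 wins, but by a margin so small that its main practical advantage over Model 1 is parsimony.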

Code
k <- 400 # minimum training-window length (about 35% of the 1138 BNTX returns)
n <- length(returns_bntx)
n-k 
[1] 738
Code
err1 <- c()
err2 <- c()

for (i in 1:100) {
  xtrain <- returns_bntx[1:(k - 1) + i]   ## sliding training window of length k
  xtest  <- returns_bntx[k + i]           ## next observation to forecast

  ## Model 1: ARIMA(2,0,1) + GARCH(1,1) on the ARIMA residuals
  arima.fit <- Arima(xtrain, order = c(2, 0, 1), include.drift = TRUE)
  arima.res <- residuals(arima.fit)
  fit1 <- garchFit(~garch(1, 1), arima.res, trace = F)
  fcast1 <- predict(fit1, n.ahead = 1)

  ## Model 2: GARCH(1,1) fitted on the differenced training window
  returns <- diff(xtrain)
  fit2 <- garchFit(~garch(1, 1), data = returns, trace = F)
  fcast2 <- predict(fit2, n.ahead = 1)

  err1 <- c(err1, (fcast1$meanForecast - xtest)^2)
  err2 <- c(err2, (fcast2$meanForecast - xtest)^2)
}

RMSE1 <- sqrt(mean(err1))
RMSE2 <- sqrt(mean(err2))

RMSE1
[1] 0.05096422
Code
RMSE2
[1] 0.05096376

Best Model: The two models identified by the methods were Model 1: ARIMA(2,0,1) + GARCH(1,1) and Model 2: GARCH(1,1). After cross-validation, Model 2 has the lower RMSE, though only marginally (0.0509638 vs 0.0509642). Therefore, the best model for the BioNTech SE returns is Model 2: GARCH(1,1).

12. Final Model Fitting

We now fit the selected model, GARCH(1,1), to each full return series.

Code
garch_pfe3 <- garchFit(~garch(1,1), returns_pfe,trace = F)
summary(garch_pfe3)

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = returns_pfe, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x13a659198>
 [data = returns_pfe]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
3.4148e-04  6.3708e-06  1.3150e-01  8.4422e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu     3.415e-04   2.195e-04    1.556     0.12    
omega  6.371e-06   1.383e-06    4.607 4.09e-06 ***
alpha1 1.315e-01   1.678e-02    7.834 4.66e-15 ***
beta1  8.442e-01   1.886e-02   44.753  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 7640.193    normalized:  2.948743 

Description:
 Mon Apr 22 20:35:46 2024 by user:  


Standardised Residuals Tests:
                                   Statistic    p-Value
 Jarque-Bera Test   R    Chi^2  1167.4713493 0.00000000
 Shapiro-Wilk Test  R    W         0.9686739 0.00000000
 Ljung-Box Test     R    Q(10)     7.2609671 0.70059971
 Ljung-Box Test     R    Q(15)     9.0490362 0.87493956
 Ljung-Box Test     R    Q(20)    14.7527009 0.79037889
 Ljung-Box Test     R^2  Q(10)    19.9311605 0.02991056
 Ljung-Box Test     R^2  Q(15)    25.8387895 0.03975168
 Ljung-Box Test     R^2  Q(20)    26.6712042 0.14477041
 LM Arch Test       R    TR^2     22.4587975 0.03268816

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-5.894398 -5.885352 -5.894403 -5.891120 

Based on the standardised residuals tests:

  1. Ljung-Box Test for Residuals (R):
    • For lags Q(10), Q(15), and Q(20), the p-values are 0.7006, 0.8749, and 0.7904, respectively. All are well above the 0.05 significance level, so we do not reject the null hypothesis of no autocorrelation in the residuals at these lags. The model captures the serial dependence in the mean well.
  2. Ljung-Box Test for Squared Residuals (R^2):
    • The squared residuals probe autocorrelation in the volatility of the series. For Q(10) and Q(15), the p-values are 0.0299 and 0.0398, below the 0.05 threshold, indicating significant autocorrelation in the squared residuals at these lags — volatility clustering that the GARCH(1,1) variance equation does not fully absorb.
    • At lag Q(20), the p-value is 0.1448, above 0.05, suggesting no significant autocorrelation in the volatility over that longer window.
  3. LM Arch Test:
    • The Lagrange Multiplier test for remaining ARCH effects has a p-value of 0.0327, which also points to some residual heteroskedasticity.
  4. Jarque-Bera and Shapiro-Wilk Tests:
    • Both tests have p-values of effectively 0, rejecting the null hypothesis of normality. The standardised residuals are heavier-tailed than the assumed normal conditional distribution.

The Ljung-Box results on the residuals (R) suggest the model captures the mean of the data well, but the squared-residual tests and the LM ARCH test reveal conditional heteroskedasticity that the GARCH(1,1) variance equation does not fully capture. A richer volatility specification — a higher-order GARCH, or a heavier-tailed conditional distribution (for example, cond.dist = "std" in garchFit) — would be worth exploring, especially since the residuals are clearly non-normal.

Code
garch_bntx3 <- garchFit(~garch(1,1), returns_bntx,trace = F)
summary(garch_bntx3)

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = returns_bntx, trace = F) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x12ceb84f0>
 [data = returns_bntx]

Conditional Distribution:
 norm 

Coefficient(s):
         mu        omega       alpha1        beta1  
-9.6626e-04   1.9556e-05   1.3600e-01   8.6697e-01  

Std. Errors:
 based on Hessian 

Error Analysis:
         Estimate  Std. Error  t value Pr(>|t|)    
mu     -9.663e-04   9.471e-04   -1.020  0.30762    
omega   1.956e-05   7.301e-06    2.679  0.00739 ** 
alpha1  1.360e-01   2.124e-02    6.403 1.53e-10 ***
beta1   8.670e-01   1.769e-02   49.013  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log Likelihood:
 2003.922    normalized:  1.760916 

Description:
 Mon Apr 22 20:35:46 2024 by user:  


Standardised Residuals Tests:
                                 Statistic      p-Value
 Jarque-Bera Test   R    Chi^2  210.209756 0.000000e+00
 Shapiro-Wilk Test  R    W        0.977673 3.049831e-12
 Ljung-Box Test     R    Q(10)    2.584753 9.895823e-01
 Ljung-Box Test     R    Q(15)    5.051780 9.916789e-01
 Ljung-Box Test     R    Q(20)    6.688114 9.975896e-01
 Ljung-Box Test     R^2  Q(10)   16.444615 8.759259e-02
 Ljung-Box Test     R^2  Q(15)   24.806730 5.259999e-02
 Ljung-Box Test     R^2  Q(20)   31.933499 4.401209e-02
 LM Arch Test       R    TR^2    17.258632 1.401247e-01

Information Criterion Statistics:
      AIC       BIC       SIC      HQIC 
-3.514801 -3.497096 -3.514826 -3.508115 

The Ljung-Box tests on the residuals give p-values far above 0.05 for Q(10), Q(15), and Q(20) (all about 0.99 or higher). There is no significant autocorrelation at the first 10, 15, or 20 lags of the residuals, indicating that the model captures the data's linear time series structure well.

When the residuals are squared to check for remaining volatility structure, however, the picture is more mixed: the p-value for Q(15) is borderline (0.0526) and the one for Q(20) falls just below the threshold (0.0440), suggesting some residual autocorrelation in the volatility of the series (conditional heteroscedasticity) at longer lags.

The Jarque-Bera and Shapiro-Wilk tests reject the null hypothesis of normality with very small p-values. The residuals do not follow a normal distribution, which matters if subsequent inference relies on the assumption of normally distributed errors.

The LM ARCH test, with a p-value of 0.1401, suggests there are no significant remaining autoregressive conditional heteroscedasticity (ARCH) effects in the residuals.

The information criteria (AIC, BIC, SIC, HQIC) measure the relative quality of the model; lower values indicate a better fit, but they are only meaningful when compared across candidate models.

In summary, while the Ljung-Box results suggest the model captures the linear dependencies of the data well, the normality tests and the borderline autocorrelation in the squared residuals indicate heavy-tailed residuals and possibly some remaining volatility structure. A heavier-tailed conditional distribution, or further investigation of nonlinearity in the series, might be warranted.

13. Final Model Selection & Forecasting

We then use the final GARCH(1,1) models to forecast the return series 100 steps ahead.
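
The widening meanError column in the forecast below follows directly from the GARCH(1,1) h-step-ahead variance recursion, sigma^2(h+1) = omega + (alpha1 + beta1) * sigma^2(h), which converges to the unconditional variance omega / (1 - alpha1 - beta1). A quick standalone check (Python for illustration), seeded with the 1-step standardDeviation from the first row of the forecast table and the fitted PFE coefficients:

```python
# Fitted PFE GARCH(1,1) coefficients (omega, alpha1, beta1) from the summary above.
omega, alpha, beta = 6.3708e-06, 0.13150, 0.84422

sd = [0.01356951]          # 1-step-ahead standard deviation (row 1 of the table)
for _ in range(3):
    next_var = omega + (alpha + beta) * sd[-1] ** 2   # h-step variance recursion
    sd.append(next_var ** 0.5)

print(sd[1:])              # matches rows 2-4 of the table to ~6 decimal places
print((omega / (1 - alpha - beta)) ** 0.5)   # long-run sd, approx 0.0162
```

The meanForecast column, by contrast, stays flat at the estimated mu, because a pure GARCH model has a constant conditional mean; only the interval width evolves with the forecast horizon.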

Code
garch_pfe3 <- garchFit(~garch(1,1), returns_pfe,trace = F)
predict(garch_pfe3, n.ahead = 100, plot=TRUE)

    meanForecast  meanError standardDeviation lowerInterval upperInterval
1   0.0003414805 0.01356951        0.01356951   -0.02625427    0.02693723
2   0.0003414805 0.01363937        0.01363937   -0.02639119    0.02707415
3   0.0003414805 0.01370718        0.01370718   -0.02652411    0.02720707
4   0.0003414805 0.01377303        0.01377303   -0.02665317    0.02733613
5   0.0003414805 0.01383698        0.01383698   -0.02677851    0.02746147
6   0.0003414805 0.01389909        0.01389909   -0.02690025    0.02758321
7   0.0003414805 0.01395943        0.01395943   -0.02701851    0.02770147
8   0.0003414805 0.01401806        0.01401806   -0.02713341    0.02781637
9   0.0003414805 0.01407502        0.01407502   -0.02724505    0.02792802
10  0.0003414805 0.01413038        0.01413038   -0.02735356    0.02803652
11  0.0003414805 0.01418419        0.01418419   -0.02745902    0.02814198
12  0.0003414805 0.01423650        0.01423650   -0.02756154    0.02824450
13  0.0003414805 0.01428735        0.01428735   -0.02766121    0.02834417
14  0.0003414805 0.01433679        0.01433679   -0.02775811    0.02844107
15  0.0003414805 0.01438487        0.01438487   -0.02785235    0.02853531
16  0.0003414805 0.01443163        0.01443163   -0.02794399    0.02862695
17  0.0003414805 0.01447710        0.01447710   -0.02803312    0.02871608
18  0.0003414805 0.01452134        0.01452134   -0.02811982    0.02880278
19  0.0003414805 0.01456437        0.01456437   -0.02820416    0.02888712
20  0.0003414805 0.01460624        0.01460624   -0.02828622    0.02896918
21  0.0003414805 0.01464697        0.01464697   -0.02836605    0.02904901
22  0.0003414805 0.01468661        0.01468661   -0.02844374    0.02912670
23  0.0003414805 0.01472518        0.01472518   -0.02851933    0.02920230
24  0.0003414805 0.01476271        0.01476271   -0.02859291    0.02927587
25  0.0003414805 0.01479925        0.01479925   -0.02866451    0.02934747
26  0.0003414805 0.01483481        0.01483481   -0.02873421    0.02941717
27  0.0003414805 0.01486942        0.01486942   -0.02880205    0.02948501
28  0.0003414805 0.01490312        0.01490312   -0.02886810    0.02955106
29  0.0003414805 0.01493593        0.01493593   -0.02893240    0.02961536
30  0.0003414805 0.01496787        0.01496787   -0.02899500    0.02967796
31  0.0003414805 0.01499897        0.01499897   -0.02905595    0.02973891
32  0.0003414805 0.01502925        0.01502925   -0.02911531    0.02979827
33  0.0003414805 0.01505874        0.01505874   -0.02917310    0.02985606
34  0.0003414805 0.01508745        0.01508745   -0.02922939    0.02991235
35  0.0003414805 0.01511542        0.01511542   -0.02928420    0.02996716
36  0.0003414805 0.01514266        0.01514266   -0.02933759    0.03002055
37  0.0003414805 0.01516919        0.01516919   -0.02938959    0.03007255
38  0.0003414805 0.01519503        0.01519503   -0.02944024    0.03012320
39  0.0003414805 0.01522020        0.01522020   -0.02948957    0.03017253
40  0.0003414805 0.01524472        0.01524472   -0.02953763    0.03022059
41  0.0003414805 0.01526861        0.01526861   -0.02958445    0.03026741
42  0.0003414805 0.01529188        0.01529188   -0.02963006    0.03031302
43  0.0003414805 0.01531456        0.01531456   -0.02967450    0.03035746
44  0.0003414805 0.01533665        0.01533665   -0.02971779    0.03040075
45  0.0003414805 0.01535817        0.01535817   -0.02975998    0.03044294
46  0.0003414805 0.01537914        0.01537914   -0.02980108    0.03048404
47  0.0003414805 0.01539958        0.01539958   -0.02984113    0.03052409
48  0.0003414805 0.01541949        0.01541949   -0.02988016    0.03056312
49  0.0003414805 0.01543889        0.01543889   -0.02991819    0.03060115
50  0.0003414805 0.01545780        0.01545780   -0.02995525    0.03063822
51  0.0003414805 0.01547623        0.01547623   -0.02999137    0.03067433
52  0.0003414805 0.01549419        0.01549419   -0.03002657    0.03070954
53  0.0003414805 0.01551169        0.01551169   -0.03006088    0.03074384
54  0.0003414805 0.01552875        0.01552875   -0.03009432    0.03077728
55  0.0003414805 0.01554538        0.01554538   -0.03012691    0.03080987
56  0.0003414805 0.01556159        0.01556159   -0.03015867    0.03084163
57  0.0003414805 0.01557739        0.01557739   -0.03018964    0.03087260
58  0.0003414805 0.01559278        0.01559278   -0.03021982    0.03090278
59  0.0003414805 0.01560779        0.01560779   -0.03024923    0.03093220
60  0.0003414805 0.01562243        0.01562243   -0.03027791    0.03096087
61  0.0003414805 0.01563669        0.01563669   -0.03030587    0.03098883
62  0.0003414805 0.01565059        0.01565059   -0.03033312    0.03101608
63  0.0003414805 0.01566415        0.01566415   -0.03035969    0.03104265
64  0.0003414805 0.01567736        0.01567736   -0.03038559    0.03106855
65  0.0003414805 0.01569025        0.01569025   -0.03041084    0.03109380
66  0.0003414805 0.01570281        0.01570281   -0.03043545    0.03111842
67  0.0003414805 0.01571505        0.01571505   -0.03045946    0.03114242
68  0.0003414805 0.01572699        0.01572699   -0.03048286    0.03116582
69  0.0003414805 0.01573863        0.01573863   -0.03050567    0.03118863
70  0.0003414805 0.01574998        0.01574998   -0.03052792    0.03121088
71  0.0003414805 0.01576105        0.01576105   -0.03054960    0.03123257
72  0.0003414805 0.01577184        0.01577184   -0.03057075    0.03125371
73  0.0003414805 0.01578236        0.01578236   -0.03059137    0.03127433
74  0.0003414805 0.01579262        0.01579262   -0.03061148    0.03129444
75  0.0003414805 0.01580262        0.01580262   -0.03063109    0.03131405
76  0.0003414805 0.01581238        0.01581238   -0.03065021    0.03133317
77  0.0003414805 0.01582189        0.01582189   -0.03066885    0.03135181
78  0.0003414805 0.01583116        0.01583116   -0.03068703    0.03136999
79  0.0003414805 0.01584021        0.01584021   -0.03070476    0.03138772
80  0.0003414805 0.01584903        0.01584903   -0.03072204    0.03140500
81  0.0003414805 0.01585763        0.01585763   -0.03073890    0.03142186
82  0.0003414805 0.01586602        0.01586602   -0.03075534    0.03143830
83  0.0003414805 0.01587420        0.01587420   -0.03077138    0.03145434
84  0.0003414805 0.01588218        0.01588218   -0.03078701    0.03146997
85  0.0003414805 0.01588996        0.01588996   -0.03080226    0.03148522
86  0.0003414805 0.01589754        0.01589754   -0.03081713    0.03150009
87  0.0003414805 0.01590494        0.01590494   -0.03083164    0.03151460
88  0.0003414805 0.01591216        0.01591216   -0.03084578    0.03152874
89  0.0003414805 0.01591920        0.01591920   -0.03085958    0.03154254
90  0.0003414805 0.01592606        0.01592606   -0.03087303    0.03155599
91  0.0003414805 0.01593276        0.01593276   -0.03088615    0.03156911
92  0.0003414805 0.01593929        0.01593929   -0.03089895    0.03158191
93  0.0003414805 0.01594566        0.01594566   -0.03091143    0.03159439
94  0.0003414805 0.01595187        0.01595187   -0.03092361    0.03160657
95  0.0003414805 0.01595793        0.01595793   -0.03093548    0.03161844
96  0.0003414805 0.01596384        0.01596384   -0.03094707    0.03163003
97  0.0003414805 0.01596960        0.01596960   -0.03095836    0.03164132
98  0.0003414805 0.01597522        0.01597522   -0.03096938    0.03165234
99  0.0003414805 0.01598071        0.01598071   -0.03098013    0.03166309
100 0.0003414805 0.01598606        0.01598606   -0.03099061    0.03167357

Our Pfizer model forecast can be evaluated on several fronts. The widening prediction intervals at longer horizons reflect growing uncertainty further into the future. The plot suggests that future values are expected to continue around the current level with a slight positive drift, and the symmetric intervals reflect the assumption of normally distributed forecast errors.
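The monotone growth of meanError in the table (from about 0.0136 toward 0.0160) is exactly what the GARCH(1,1) variance recursion predicts: multi-step variance forecasts converge toward the long-run level. A self-contained sketch with illustrative parameters (not the fitted values):

```r
# Sketch: h-step-ahead GARCH(1,1) variance forecasts with hypothetical
# parameters, illustrating why meanError rises toward a long-run limit.
omega  <- 1e-5; alpha1 <- 0.08; beta1 <- 0.90  # assumed, not the fitted values
sigma2 <- numeric(100)
sigma2[1] <- 2e-4                              # assumed one-step-ahead variance
for (h in 2:100) sigma2[h] <- omega + (alpha1 + beta1) * sigma2[h - 1]

long_run <- omega / (1 - alpha1 - beta1)       # unconditional variance
sqrt(sigma2[c(1, 50, 100)])                    # grows monotonically with h
sqrt(long_run)                                 # the limiting "meanError"
```

Because the assumed one-step variance starts below the long-run level, each iteration pulls the forecast upward toward it, mirroring the pattern in the PFE output above.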

Code
garch_bntx3 <- garchFit(~ garch(1, 1), data = returns_bntx, trace = FALSE)  # garchFit() from the fGarch package
predict(garch_bntx3, n.ahead = 100, plot = TRUE)

     meanForecast  meanError standardDeviation lowerInterval upperInterval
1   -0.0009662578 0.01720081        0.01720081   -0.03467922    0.03274670
2   -0.0009662578 0.01778489        0.01778489   -0.03582401    0.03389149
3   -0.0009662578 0.01835205        0.01835205   -0.03693562    0.03500310
4   -0.0009662578 0.01890381        0.01890381   -0.03801705    0.03608453
5   -0.0009662578 0.01944149        0.01944149   -0.03907087    0.03713835
6   -0.0009662578 0.01996622        0.01996622   -0.04009933    0.03816681
7   -0.0009662578 0.02047901        0.02047901   -0.04110438    0.03917186
8   -0.0009662578 0.02098074        0.02098074   -0.04208775    0.04015523
9   -0.0009662578 0.02147218        0.02147218   -0.04305097    0.04111845
10  -0.0009662578 0.02195404        0.02195404   -0.04399539    0.04206288
11  -0.0009662578 0.02242694        0.02242694   -0.04492224    0.04298973
12  -0.0009662578 0.02289142        0.02289142   -0.04583262    0.04390010
13  -0.0009662578 0.02334801        0.02334801   -0.04672751    0.04479500
14  -0.0009662578 0.02379715        0.02379715   -0.04760782    0.04567530
15  -0.0009662578 0.02423927        0.02423927   -0.04847436    0.04654184
16  -0.0009662578 0.02467475        0.02467475   -0.04932788    0.04739536
17  -0.0009662578 0.02510393        0.02510393   -0.05016906    0.04823655
18  -0.0009662578 0.02552714        0.02552714   -0.05099854    0.04906603
19  -0.0009662578 0.02594468        0.02594468   -0.05181689    0.04988438
20  -0.0009662578 0.02635681        0.02635681   -0.05262466    0.05069214
21  -0.0009662578 0.02676379        0.02676379   -0.05342233    0.05148981
22  -0.0009662578 0.02716586        0.02716586   -0.05421037    0.05227785
23  -0.0009662578 0.02756323        0.02756323   -0.05498920    0.05305668
24  -0.0009662578 0.02795611        0.02795611   -0.05575922    0.05382671
25  -0.0009662578 0.02834468        0.02834468   -0.05652082    0.05458830
26  -0.0009662578 0.02872913        0.02872913   -0.05727433    0.05534181
27  -0.0009662578 0.02910963        0.02910963   -0.05802008    0.05608756
28  -0.0009662578 0.02948632        0.02948632   -0.05875838    0.05682587
29  -0.0009662578 0.02985936        0.02985936   -0.05948952    0.05755701
30  -0.0009662578 0.03022888        0.03022888   -0.06021377    0.05828125
31  -0.0009662578 0.03059501        0.03059501   -0.06093138    0.05899887
32  -0.0009662578 0.03095789        0.03095789   -0.06164260    0.05971009
33  -0.0009662578 0.03131762        0.03131762   -0.06234766    0.06041515
34  -0.0009662578 0.03167431        0.03167431   -0.06304677    0.06111425
35  -0.0009662578 0.03202808        0.03202808   -0.06374013    0.06180762
36  -0.0009662578 0.03237901        0.03237901   -0.06442795    0.06249543
37  -0.0009662578 0.03272720        0.03272720   -0.06511040    0.06317788
38  -0.0009662578 0.03307275        0.03307275   -0.06578765    0.06385514
39  -0.0009662578 0.03341573        0.03341573   -0.06645989    0.06452737
40  -0.0009662578 0.03375624        0.03375624   -0.06712726    0.06519475
41  -0.0009662578 0.03409433        0.03409433   -0.06778992    0.06585741
42  -0.0009662578 0.03443010        0.03443010   -0.06844801    0.06651550
43  -0.0009662578 0.03476361        0.03476361   -0.06910168    0.06716916
44  -0.0009662578 0.03509492        0.03509492   -0.06975104    0.06781853
45  -0.0009662578 0.03542411        0.03542411   -0.07039623    0.06846372
46  -0.0009662578 0.03575123        0.03575123   -0.07103738    0.06910486
47  -0.0009662578 0.03607634        0.03607634   -0.07167458    0.06974207
48  -0.0009662578 0.03639950        0.03639950   -0.07230796    0.07037545
49  -0.0009662578 0.03672076        0.03672076   -0.07293763    0.07100511
50  -0.0009662578 0.03704018        0.03704018   -0.07356368    0.07163116
51  -0.0009662578 0.03735781        0.03735781   -0.07418621    0.07225370
52  -0.0009662578 0.03767368        0.03767368   -0.07480532    0.07287280
53  -0.0009662578 0.03798786        0.03798786   -0.07542110    0.07348858
54  -0.0009662578 0.03830038        0.03830038   -0.07603363    0.07410111
55  -0.0009662578 0.03861129        0.03861129   -0.07664300    0.07471049
56  -0.0009662578 0.03892063        0.03892063   -0.07724929    0.07531678
57  -0.0009662578 0.03922844        0.03922844   -0.07785258    0.07592007
58  -0.0009662578 0.03953475        0.03953475   -0.07845295    0.07652043
59  -0.0009662578 0.03983961        0.03983961   -0.07905046    0.07711794
60  -0.0009662578 0.04014305        0.04014305   -0.07964519    0.07771267
61  -0.0009662578 0.04044510        0.04044510   -0.08023720    0.07830468
62  -0.0009662578 0.04074580        0.04074580   -0.08082656    0.07889405
63  -0.0009662578 0.04104518        0.04104518   -0.08141334    0.07948082
64  -0.0009662578 0.04134328        0.04134328   -0.08199759    0.08006507
65  -0.0009662578 0.04164011        0.04164011   -0.08257937    0.08064686
66  -0.0009662578 0.04193572        0.04193572   -0.08315875    0.08122624
67  -0.0009662578 0.04223012        0.04223012   -0.08373577    0.08180326
68  -0.0009662578 0.04252335        0.04252335   -0.08431050    0.08237798
69  -0.0009662578 0.04281544        0.04281544   -0.08488298    0.08295046
70  -0.0009662578 0.04310640        0.04310640   -0.08545326    0.08352074
71  -0.0009662578 0.04339627        0.04339627   -0.08602139    0.08408888
72  -0.0009662578 0.04368507        0.04368507   -0.08658743    0.08465491
73  -0.0009662578 0.04397282        0.04397282   -0.08715141    0.08521890
74  -0.0009662578 0.04425955        0.04425955   -0.08771339    0.08578087
75  -0.0009662578 0.04454528        0.04454528   -0.08827340    0.08634088
76  -0.0009662578 0.04483002        0.04483002   -0.08883148    0.08689897
77  -0.0009662578 0.04511381        0.04511381   -0.08938769    0.08745518
78  -0.0009662578 0.04539665        0.04539665   -0.08994206    0.08800954
79  -0.0009662578 0.04567858        0.04567858   -0.09049462    0.08856211
80  -0.0009662578 0.04595960        0.04595960   -0.09104543    0.08911291
81  -0.0009662578 0.04623975        0.04623975   -0.09159450    0.08966199
82  -0.0009662578 0.04651903        0.04651903   -0.09214189    0.09020937
83  -0.0009662578 0.04679747        0.04679747   -0.09268761    0.09075510
84  -0.0009662578 0.04707508        0.04707508   -0.09323172    0.09129921
85  -0.0009662578 0.04735188        0.04735188   -0.09377424    0.09184173
86  -0.0009662578 0.04762789        0.04762789   -0.09431521    0.09238269
87  -0.0009662578 0.04790312        0.04790312   -0.09485465    0.09292213
88  -0.0009662578 0.04817759        0.04817759   -0.09539259    0.09346008
89  -0.0009662578 0.04845131        0.04845131   -0.09592907    0.09399656
90  -0.0009662578 0.04872430        0.04872430   -0.09646412    0.09453161
91  -0.0009662578 0.04899657        0.04899657   -0.09699777    0.09506525
92  -0.0009662578 0.04926814        0.04926814   -0.09753003    0.09559752
93  -0.0009662578 0.04953902        0.04953902   -0.09806095    0.09612843
94  -0.0009662578 0.04980922        0.04980922   -0.09859054    0.09665803
95  -0.0009662578 0.05007877        0.05007877   -0.09911884    0.09718632
96  -0.0009662578 0.05034766        0.05034766   -0.09964586    0.09771335
97  -0.0009662578 0.05061592        0.05061592   -0.10017164    0.09823912
98  -0.0009662578 0.05088355        0.05088355   -0.10069619    0.09876368
99  -0.0009662578 0.05115058        0.05115058   -0.10121955    0.09928703
100 -0.0009662578 0.05141700        0.05141700   -0.10174173    0.09980921

Our BioNTech SE model forecast can be evaluated on the same fronts. The widening prediction intervals at longer horizons reflect growing uncertainty further into the future, and they widen noticeably faster than Pfizer's. The plot suggests that future values are expected to continue around the current level with a slight negative drift, and the symmetric intervals reflect the assumption of normally distributed forecast errors.
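The widening of both sets of intervals follows from the closed-form h-step-ahead variance forecast of a GARCH(1,1) model, written here in the same notation as the model equations below, with \(\bar{\sigma}^{2}\) denoting the unconditional variance:

\(\sigma^{2}_{t+h|t} = \bar{\sigma}^{2} + (\alpha_{1}+\beta_{1})^{h-1}\left(\sigma^{2}_{t+1|t} - \bar{\sigma}^{2}\right), \qquad \bar{\sigma}^{2} = \frac{\alpha_{0}}{1-\alpha_{1}-\beta_{1}}\)

When \(\alpha_{1}+\beta_{1} < 1\), the forecasts decay geometrically toward \(\bar{\sigma}^{2}\); the closer the persistence \(\alpha_{1}+\beta_{1}\) is to one, and the further the current variance sits from the long-run level, the longer the intervals keep widening, which is consistent with the BNTX meanError growing much faster over the 100-day horizon than the PFE one.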

14. Volatility Plot

Code
ht <- garch_pfe3@h.t  # fitted conditional variances from the GARCH(1,1) fit
pfe <- data.frame(pfe)
pfe <- head(pfe, -1)  # drop one row so the dates align with ht
pfe <- data.frame(pfe, rownames(pfe))
colnames(pfe)[7] <- "date"
pfe$date <- as.Date(pfe$date, "%Y-%m-%d")
data_spot <- data.frame(ht, pfe$date)
ggplot(data_spot, aes(y = ht, x = pfe.date)) +
  geom_line(col = "#5a3196") +
  ylab("Conditional Variance") + xlab("Date") +
  ggtitle("Volatility plot for Pfizer") +
  theme_bw()

In our Pfizer volatility plot, the conditional variance fluctuates over time, with several pronounced spikes marking periods of high volatility, notably in 2020 and 2022. These spikes likely correspond to specific events or announcements that increased trading activity or uncertainty around Pfizer's stock price. The quieter stretches between the spikes, where the variance is lower, represent more stable periods with smaller price fluctuations.

Code
ht <- garch_bntx3@h.t  # fitted conditional variances from the GARCH(1,1) fit
bntx <- data.frame(bntx)
bntx <- head(bntx, -1)  # drop one row so the dates align with ht
bntx <- data.frame(bntx, rownames(bntx))
colnames(bntx)[7] <- "date"
bntx$date <- as.Date(bntx$date, "%Y-%m-%d")
data_spot <- data.frame(ht, bntx$date)
ggplot(data_spot, aes(y = ht, x = bntx.date)) +
  geom_line(col = "#5a3196") +
  ylab("Conditional Variance") + xlab("Date") +
  ggtitle("Volatility plot for BioNTech SE") +
  theme_bw()

In our BioNTech SE volatility plot, the conditional variance likewise fluctuates over time, with several pronounced spikes marking periods of high volatility, notably in 2020. These spikes likely correspond to specific events or announcements that increased trading activity or uncertainty around BioNTech SE's stock price. The quieter stretches between the spikes, where the variance is lower, represent more stable periods with smaller price fluctuations.
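A common way to quantify how quickly such volatility spikes die out is the shock half-life implied by the model's persistence \(\alpha_{1}+\beta_{1}\). A minimal sketch with a hypothetical persistence value (the fitted value would come from the estimated coefficients, not the number used here):

```r
# Sketch: half-life of a volatility shock under an assumed GARCH(1,1)
# persistence (alpha1 + beta1); 0.98 is illustrative, not a fitted value.
persistence <- 0.98
half_life <- log(0.5) / log(persistence)
half_life  # ~34 trading days for a spike to decay halfway back to baseline
```

Higher persistence means the spikes visible in the plots fade more slowly, which is why some elevated-variance episodes stretch over many weeks.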

15. Model Equations

For both the Pfizer and BioNTech SE return series, the best-performing model was GARCH(1,1). Its conditional variance equation is:

\(\sigma_{t}^{2} = \alpha_{0} + \alpha_{1}\epsilon^{2}_{t-1}+\beta_{1}\sigma^{2}_{t-1}\)
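For completeness, the variance equation is paired with a mean equation; the full specification assumed here (sketched under garchFit's default setup of a constant mean and normal innovations) is:

\(r_{t} = \mu + \epsilon_{t}, \qquad \epsilon_{t} = \sigma_{t} z_{t}, \qquad z_{t} \sim \mathcal{N}(0,1)\)

together with the variance equation above. Stationarity of the variance requires \(\alpha_{1}+\beta_{1} < 1\), in which case the unconditional variance is \(\alpha_{0}/(1-\alpha_{1}-\beta_{1})\).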